Hi, we're going to conclude this course with our last chapter, which is about statistical inversion, a rather different approach to inverse problems that comes from the statistical community.

The guiding example for this chapter will be the following, very easy inverse problem: y = x + epsilon. This is the simplest version of an inverse problem we can have: we observe the parameter directly, and the only thing causing trouble is some additive noise. Let's assume that epsilon is Gaussian noise. The exact distribution is not important, but its magnitude is given by sigma, so epsilon is roughly of size plus or minus sigma, maybe plus or minus two sigma; that is the order of magnitude of the noise.
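Just to make the data model tangible, here is a minimal sketch of generating one observation y = x + epsilon with noise level sigma; all concrete numbers are assumptions chosen purely for illustration, not values from the lecture.

```python
# Minimal sketch of the data model y = x + epsilon with Gaussian noise of
# magnitude sigma. The concrete numbers are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)

x_true = 1.0                        # the unknown parameter
sigma = 0.5                         # assumed noise level
epsilon = sigma * rng.standard_normal()

y = x_true + epsilon                # the observed data
print(y)                            # differs from x_true by roughly +/- sigma
```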
And well, obviously the best reconstruction we can get from the data is x star equal to y. If you analyze this system with our basic tools, which is a bit overkill here, the minimum-norm solution is exactly x star equal to y, and the truncated SVD gives the same, unless you truncate so much that the result is zero. So nothing much is happening here: the best reconstruction is just taking the data and using it as a surrogate for the parameter.
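To make the claim about the basic tools concrete, here is a short worked sketch. It assumes we write the forward operator as A = I (so that y = Ax + epsilon); the symbol A is just a label introduced here for illustration.

```latex
% Minimum-norm least-squares solution when the forward operator is the
% identity, A = I: the pseudoinverse of I is I itself, so
\[
  x^{*} \;=\; A^{\dagger} y \;=\; I\, y \;=\; y .
\]
% For the truncated SVD: every singular value of I equals 1, so either all
% components are kept (again giving x^{*} = y) or everything is truncated
% (giving the zero reconstruction).
```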
And why is that? Well, epsilon is symmetric. Let's make a quick sketch to see why this makes sense. Draw the line on which the parameter can live, and say the true parameter sits here. On top of it we have some distribution, the distribution of the noise, and it picks out the data, which lands here: y equals x plus some small perturbation. Now, what can you do? You just have y. You don't know whether y came from this specific parameter, the green one, or whether it could, with equal probability so to speak, have come from another parameter, call it x prime, which carries the same symmetric distribution and would have led to the same y if the noise had been exactly the opposite of the one we actually got. So there is a symmetry here because the noise is symmetric. This y could also have arisen from a whole range of points; we could draw a curve here, the set of plausible parameters that might have produced this data. So we don't know which one to pick: the green x, which is the true one, x prime, or anything in between.
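As a small illustration of this symmetry argument, one can check numerically that a parameter x and its mirror image x prime = 2y - x explain the data equally well under symmetric noise; the concrete numbers below are assumptions chosen only for this sketch.

```python
# Numerical sketch of the symmetry argument: for symmetric (Gaussian) noise,
# x_true and its reflection about y are equally plausible explanations of y.
import numpy as np
from scipy.stats import norm

sigma = 0.5           # assumed noise level
x_true = 1.0          # the (unknown) true parameter
y = 1.3               # some observed data y = x_true + noise

x_mirror = 2 * y - x_true   # x' : the reflection of x_true about y

# The Gaussian density only depends on |y - x|, so both values coincide:
print(norm.pdf(y, loc=x_true, scale=sigma))    # p(y | x_true)
print(norm.pdf(y, loc=x_mirror, scale=sigma))  # p(y | x') -- same value
```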
To decide, we can plot the so-called likelihood function. The likelihood function is the probability density of y given x, viewed as a function of x: y is given, we know what y is, and we now vary x and look at how probable it is that we would have obtained the data y if x had been the correct parameter. We can write this function down because we know the distribution of the noise: it is 1 over the square root of 2 pi sigma squared, times e to the minus (y minus x) squared over 2 sigma squared. We will go into more detail about why this is true later, but for now just indulge me.
And if we plot this likelihood function, where remember y is given and we are just varying x, it roughly looks like this. The most likely parameter to have yielded the data y is exactly the choice x equal to y; it has the highest likelihood in the sense of this function. There is also a small range of parameters around y which are still plausible, for example the true parameter, which we don't actually know, or, symmetrically on the other side of y, the point x prime. But the most plausible, the most likely parameter is simply the data y itself.
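Here is a minimal numerical sketch of that picture: it evaluates the likelihood on a grid of x values around the data and locates its maximum. The values of y and sigma are arbitrary assumptions for illustration.

```python
# Sketch: plot the likelihood x -> p(y | x) for fixed data y and check that
# it peaks at x = y. All concrete numbers are illustrative assumptions.
import numpy as np
import matplotlib.pyplot as plt

y, sigma = 1.3, 0.5                          # assumed data and noise level
xs = np.linspace(y - 3 * sigma, y + 3 * sigma, 500)

likelihood = np.exp(-(y - xs) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

x_star = xs[np.argmax(likelihood)]           # maximizer of the likelihood
print(x_star)                                # approximately equal to y

plt.plot(xs, likelihood)                     # bell-shaped curve centered at y
plt.axvline(y, linestyle="--")
plt.xlabel("x")
plt.ylabel("p(y | x)")
plt.show()
```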
So this is why we say that this is the best reconstruction. It matches our minimum-norm solution, because that is a least-squares reconstruction, but we can also think about it in terms of probability.
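The link between the likelihood view and least squares can be made explicit with a short derivation, using the Gaussian density written above:

```latex
% Maximizing the Gaussian likelihood is the same as minimizing the squared
% residual, since the prefactor does not depend on x and exp is monotone:
\[
  \arg\max_{x}\, p(y \mid x)
  \;=\; \arg\max_{x}\, \exp\!\Big(-\frac{(y-x)^{2}}{2\sigma^{2}}\Big)
  \;=\; \arg\min_{x}\, (y-x)^{2}
  \;=\; y .
\]
```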
And x star here is the most likely parameter. The most likely, but what does likely mean? Likely in the sense that it maximizes the likelihood function: x star is the argmax of p(y | x), where we vary only x and keep y fixed. The likelihood is this bell-shaped curve, and the maximum of the bell-shaped curve is attained if we take x equal to y. Okay, so hopefully this makes sense. This is the